# High-precision action recognition
Videomae Base Finetuned Kinetics Violence Nonviolence Tuned
A video classification model based on the VideoMAE architecture, fine-tuned to classify scenes as violent or non-violent.
Video Processing
Transformers

cliffer1
56
0
Xclip Large Patch14 Kinetics 600
MIT
X-CLIP is an extended version of CLIP for general video-language understanding, trained on video-text pairs through contrastive learning.
Text-to-Video
Transformers
English

microsoft
124
5
Featured Recommended AI Models